{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "54f2ab19-68e1-4725-b6e7-efd8eedebe1a",
   "metadata": {},
   "source": [
    "<center><img src=https://raw.githubusercontent.com/feast-dev/feast/master/docs/assets/feast_logo.png width=400/></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69a40de4-65cf-4b45-b321-2b7ce571f8cb",
   "metadata": {},
   "source": [
    "# Credit Risk Model Training"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe641d83-1e28-4f7f-895c-8ca038f6cc53",
   "metadata": {},
   "source": [
    "### Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f04f635-401b-47b6-b807-df61d42ec752",
   "metadata": {},
   "source": [
    "AI models have played a central role in modern credit risk assessment systems. In this example, we develop a credit risk model to predict whether a future loan will be good or bad, given some context data (presumably supplied from the loan application process). We use the modeling process to demonstrate how Feast can be used to facilitate the serving of data for training and inference use-cases.\n",
    "\n",
    "In this notebook, we train our AI model. We will use the popular scikit-learn library (sklearn) to train a RandomForestClassifier, as this is a relatively easy choice for a baseline model."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a96bf1aa-c450-4201-83a4-e25b08bdd12d",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a47b33bc-bc06-4de0-8f3a-beea8179035c",
   "metadata": {},
   "source": [
    "*The following code assumes that you have read the example README.md file, and that you have setup an environment where the code can be run. Please make sure you have addressed the prerequisite needs.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c66a3dab-fdbf-40be-8227-6180dc314a84",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Imports\n",
    "import warnings\n",
    "import datetime\n",
    "import feast\n",
    "import joblib\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "\n",
    "from feast import FeatureStore, RepoConfig\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import OrdinalEncoder\n",
    "from sklearn.compose import ColumnTransformer\n",
    "from sklearn.pipeline import Pipeline\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.metrics import classification_report"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "2a841445-fa47-4826-a874-28ac0e4ea57f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Ignore warnings\n",
    "warnings.filterwarnings(action=\"ignore\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "23579727-7797-4101-a70d-b0d4c24b0fdf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Random seed\n",
    "SEED = 142"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc5be519-7733-449b-8dc3-411e86371315",
   "metadata": {},
   "source": [
    "This notebook assumes that you have already done the following:\n",
    "\n",
    "1. Run the [01_Credit_Risk_Data_Prep.ipynb](01_Credit_Risk_Data_Prep.ipynb) notebook to prepare the data.\n",
    "2. Run the [02_Deploying_the_Feature_Store.ipynb](02_Deploying_the_Feature_Store.ipynb) notebook to configure the feature stores and launch the feature store servers.\n",
    "\n",
    "If you have not completed the above steps, please go back and do so before continuing. This notebook relies on the data prepared by 1, and it uses the Feast offline server stood up by 2."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ca99047-e508-4b1f-9f4c-f11e38587d70",
   "metadata": {},
   "source": [
    "### Load Label (Outcome) Data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89b49268-b7a5-4abc-8d82-1cdbf9bb4473",
   "metadata": {},
   "source": [
    "From our previous data exploration, remember that the label data represents whether the loan was classed as \"good\" (1) or \"bad\" (0). Let's pull the labels for training, as we will use them as our \"entity dataframe\" when pulling features.\n",
    "\n",
    "This is also a good time to remember that the label timestamps are lagged by 30-90 days from the context data records."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "6a227a12-7b3e-462a-8f6e-38a7690df1c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "labels = pd.read_parquet(\"Feature_Store/data/labels.parquet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "31a39cad-0a85-4d98-ad95-008c81bb6fe0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>class</th>\n",
       "      <th>outcome_timestamp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-11-24 22:50:13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>bad</td>\n",
       "      <td>2023-11-03 12:10:13</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-11-30 22:06:03</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-11-17 07:37:19</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>bad</td>\n",
       "      <td>2023-12-01 05:01:48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   ID class   outcome_timestamp\n",
       "0   0  good 2023-11-24 22:50:13\n",
       "1   1   bad 2023-11-03 12:10:13\n",
       "2   2  good 2023-11-30 22:06:03\n",
       "3   3  good 2023-11-17 07:37:19\n",
       "4   4   bad 2023-12-01 05:01:48"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "labels.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "857f29fd-46d3-444b-b24f-eaccd82ab7d3",
   "metadata": {},
   "source": [
    "### Pull Feature Data from Feast Offline Store"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07c13b69-3d26-484c-97cd-97734cc812bd",
   "metadata": {},
   "source": [
    "In order to pull feature data from the offline store, we create a FeatureStore object that connects to the offline server (continuously running in the previous notebook)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "9e9828f8-f210-4586-ac36-3f7e17f4f1e8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create FeatureStore object\n",
    "# (connects to the offline server deployed in 02_Deploying_the_Feature_Store.ipynb) \n",
    "store = FeatureStore(config=RepoConfig(\n",
    "    project=\"loan_applications\",\n",
    "    provider=\"local\",\n",
    "    registry=\"Feature_Store/data/registry.db\",\n",
    "    offline_store={\n",
    "        \"type\": \"remote\",\n",
    "        \"host\": \"localhost\",\n",
    "        \"port\": 8815\n",
    "    },\n",
    "    entity_key_serialization_version=3\n",
    "))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c007e7ca-40c1-4850-abed-73b6171ad08d",
   "metadata": {},
   "source": [
    "Now, we can retrieve feature data by supplying our entity dataframe and feature specifications to the `get_historical_features` function. Note that this function performs a fuzzy lookback (\"point-in-time\") join, matching the lagged outcome timestamp to the closest application timestamp (per ID) in the context data; it also joins the \"a\" and \"b\" features that we had previously split into two tables.\n",
    "\n",
    "To keep this example simple, we will limit our feature set to the numerical features plus two categorical features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "dd2e3cb5-c865-48f4-80b6-8a14a1ff09ab",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:root:_list_feature_views will make breaking changes. Please use _list_batch_feature_views instead. _list_feature_views will behave like _list_all_feature_views in the future.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using outcome_timestamp as the event timestamp. To specify a column explicitly, please name it event_timestamp.\n"
     ]
    }
   ],
   "source": [
    "# Get feature data\n",
    "# (Joins a and b data, and selects records with the right timestamps)\n",
    "df = store.get_historical_features(\n",
    "    entity_df=labels,\n",
    "    features=[\n",
    "        \"data_a:duration\",\n",
    "        \"data_a:credit_amount\",\n",
    "        \"data_a:installment_commitment\",\n",
    "        \"data_a:checking_status\",\n",
    "        \"data_b:residence_since\",\n",
    "        \"data_b:age\",\n",
    "        \"data_b:existing_credits\",\n",
    "        \"data_b:num_dependents\",\n",
    "        \"data_b:housing\"\n",
    "    ]\n",
    ").to_df()"
   ]
  },
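  {
   "cell_type": "markdown",
   "id": "pit-join-sketch",
   "metadata": {},
   "source": [
    "The point-in-time join described above can be illustrated without a running feature store. The sketch below is only an analogy built on `pandas.merge_asof`; it is not how Feast implements the join. The toy data and the `application_timestamp` column name are hypothetical and chosen for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical context data: ID 1 has two applications, ID 2 has one\n",
    "context = pd.DataFrame({\n",
    "    \"ID\": [1, 1, 2],\n",
    "    \"application_timestamp\": pd.to_datetime(\n",
    "        [\"2023-01-01\", \"2023-06-01\", \"2023-03-01\"]\n",
    "    ),\n",
    "    \"credit_amount\": [1000, 2500, 4000],\n",
    "})\n",
    "\n",
    "# Hypothetical label data with lagged outcome timestamps\n",
    "outcomes = pd.DataFrame({\n",
    "    \"ID\": [1, 2],\n",
    "    \"outcome_timestamp\": pd.to_datetime([\"2023-07-15\", \"2023-04-10\"]),\n",
    "    \"class\": [\"good\", \"bad\"],\n",
    "})\n",
    "\n",
    "# For each outcome row, pick the latest application row with the same ID\n",
    "# whose timestamp is at or before the outcome timestamp.\n",
    "# merge_asof requires both frames to be sorted by their time keys.\n",
    "joined = pd.merge_asof(\n",
    "    outcomes.sort_values(\"outcome_timestamp\"),\n",
    "    context.sort_values(\"application_timestamp\"),\n",
    "    left_on=\"outcome_timestamp\",\n",
    "    right_on=\"application_timestamp\",\n",
    "    by=\"ID\",\n",
    "    direction=\"backward\",\n",
    ")\n",
    "print(joined[[\"ID\", \"outcome_timestamp\", \"application_timestamp\", \"credit_amount\"]])\n",
    "```\n",
    "\n",
    "With `direction=\"backward\"`, a feature row recorded after the outcome timestamp is never selected, which is the property that keeps future information out of the training data."
   ]
  },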
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "c72f6cb1-bbbf-4512-98cd-0abe5ff0c24b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 1000 entries, 0 to 999\n",
      "Data columns (total 12 columns):\n",
      " #   Column                  Non-Null Count  Dtype              \n",
      "---  ------                  --------------  -----              \n",
      " 0   ID                      1000 non-null   int64              \n",
      " 1   class                   1000 non-null   category           \n",
      " 2   outcome_timestamp       1000 non-null   datetime64[ns, UTC]\n",
      " 3   duration                1000 non-null   int64              \n",
      " 4   credit_amount           1000 non-null   int64              \n",
      " 5   installment_commitment  1000 non-null   int64              \n",
      " 6   checking_status         1000 non-null   category           \n",
      " 7   residence_since         1000 non-null   int64              \n",
      " 8   age                     1000 non-null   int64              \n",
      " 9   existing_credits        1000 non-null   int64              \n",
      " 10  num_dependents          1000 non-null   int64              \n",
      " 11  housing                 1000 non-null   category           \n",
      "dtypes: category(3), datetime64[ns, UTC](1), int64(8)\n",
      "memory usage: 73.8 KB\n"
     ]
    }
   ],
   "source": [
    "# Check the data info\n",
    "df.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "110ea48c-0a5a-4642-aaba-a9eeb4a7da48",
   "metadata": {},
   "source": [
    "### Split the Data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6669dce-a8b0-4d80-9a15-70b7dfd2d718",
   "metadata": {},
   "source": [
    "Next, we split the data into a `train` and `validate` set, which we will use to train and then validate a model. The validation set will allow us to more accurately assess the model's performance on data that it has not seen during the training phase."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "036b0a54-48e4-4414-bb8c-0c30b6ab7469",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Split data into train and validate datasets\n",
    "train, validate = train_test_split(df, test_size=0.2, random_state=SEED)"
   ]
  },
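  {
   "cell_type": "markdown",
   "id": "stratified-split-sketch",
   "metadata": {},
   "source": [
    "The split above is purely random. Because this dataset has many more good loans than bad ones (as the class counts in the next section show), one option worth considering is a stratified split, which keeps the class ratio the same in both sets. The sketch below is an optional variation, not the method used in this notebook; the toy dataframe and the names `toy`, `toy_train`, and `toy_validate` are hypothetical.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "# Hypothetical toy frame with a 70/30 class imbalance\n",
    "toy = pd.DataFrame({\n",
    "    \"class\": [\"good\"] * 70 + [\"bad\"] * 30,\n",
    "    \"credit_amount\": range(100),\n",
    "})\n",
    "\n",
    "# stratify= asks scikit-learn to preserve the class ratio in both splits\n",
    "toy_train, toy_validate = train_test_split(\n",
    "    toy, test_size=0.2, random_state=142, stratify=toy[\"class\"]\n",
    ")\n",
    "\n",
    "print(toy_train[\"class\"].value_counts())\n",
    "print(toy_validate[\"class\"].value_counts())\n",
    "```\n",
    "\n",
    "In this notebook, the equivalent change would be to pass `stratify=df[\"class\"]` to the existing `train_test_split` call."
   ]
  },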
  {
   "cell_type": "markdown",
   "id": "4b65cbf7-5981-4f51-97aa-a3ff7027f2f3",
   "metadata": {},
   "source": [
    "### Exploratory Data Analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e516ded8-10ad-4274-a736-f288290b5883",
   "metadata": {},
   "source": [
    "Before building a model, a data scientist needs to gain understanding of the data to make sure it meets important statistical assumptions, and to identify potential opportunities and issues. As the purpose of this particular example is to show working with Feast, we will take the view of a data scientist looking to build a quick baseline model to establish some low-end metrics.\n",
    "\n",
    "Note that this data set is very \"clean\", as it has already been prepared. In real-life, production credit risk data can be much more complex, and have many issues that need to be understood and addressed before modeling."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "553986a0-c804-4ab4-a4b9-48b16c72fd4f",
   "metadata": {},
   "source": [
    "Let's look at counts for the target variable `class`, which tells us whether a (historical) loan was good or bad. We can see that there were many more good loans than bad, making the dataset imbalanced."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "607bd29b-eaf4-41a6-aaca-a8eaaf37e2d2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjsAAAHHCAYAAABZbpmkAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAysElEQVR4nO3deVxV5d7///dmRmWDoIKWqJWKOHZw2k1qkWRkeWvllKlHG8FKy2PcOWLedqwcQ6tTqWWm2WBq5kRZHcVSTFNT1MqwFCgVtnoUBNbvj37sb/ugpQhsvHw9H4/1yHVd11rrc+3d1jdr2Ngsy7IEAABgKC9PFwAAAFCRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIO8AlyGazafz48Z4u45LD6wZcngg7QDmYN2+ebDab21KnTh116dJFn3zyiafLq3T5+fmaNWuWbrjhBtWsWVN+fn6qV6+e7rzzTr3zzjsqKirydInndODAAdlsNr3wwgueLuWCOZ1OTZgwQa1bt1aNGjUUGBioFi1aaNSoUTp06JCny5MkrVy5ksCJSufj6QIAkyQnJ6tRo0ayLEvZ2dmaN2+ebr/9di1fvlx33HGHp8urFL/++qu6deum9PR0xcXFafTo0QoNDVVWVpbWrVunfv36af/+/RozZoynSzXKDz/8oNjYWGVmZuqee+7Rgw8+KD8/P3377bd6/fXX9eGHH2rv3r2eLlMrV65USkoKgQeVirADlKNu3bqpbdu2rvUhQ4YoPDxc77zzzmUTdgYMGKBvvvlG77//vnr27OnWl5SUpC1btigjI8ND1ZmpsLBQPXv2VHZ2ttavX68bbrjBrX/SpEn65z//6aHqAM/jMhZQgUJCQhQYGCgfH/efK1544QVdd911CgsLU2BgoGJiYvTee++V2j4/P1/Dhw9X7dq1FRQUpDvvvFM///zzXx43OztbPj4+mjBhQqm+jIwM2Ww2vfTSS5KkM2fOaMKECWrcuLECAgIUFhamG264QWvXrr3g+aalpWn16tV68MEHSwWdEm3btlX//v3d2nJyclzBMCAgQK1bt9b8+fNLbXvy5Ek9+eSTql+/vvz9/dW0aVO98MILsizLbVxZX7cLcb41n+97bbPZlJiYqKVLl6pFixby9/dX8+bNtWrVqr+s5f3339f27dv1zDPPlAo6kmS32zVp0iS3tiVLligmJkaBgYGqVauW7rvvPv3yyy9uYzp37qzOnTuX2t+gQYPUsGFD1/ofL/29+uqruvrqq+Xv76927dpp8+bNbtulpKS45luyABWNMztAOcrLy9Nvv/0my7KUk5OjWbNm6cSJE7rvvvvcxs2YMUN33nmn+vfvr4KCAi1atEj33HOPVqxYofj4eNe4oUOHasGCBerXr5+uu+46ffrpp2795xIeHq5OnTrp3Xff1bhx49z6Fi9eLG9vb91zzz2SpPHjx2vy5MkaOnSo2rdvL6fTqS1btmjr1q269dZbL2j+y5cvl6RS8/0zp06dUufOnbV//34lJiaqUaNGWrJkiQYNGqTc3Fw9/vjjkiTLsnTnnXfqs88+05AhQ9SmTRutXr1aI0eO1C+//KJp06a59lnW1628a5bO/72WpH//+9/64IMP9OijjyooKEgzZ85Ur169lJmZqbCwsHPWs2zZMkm/n1U7H/PmzdPgwYPVrl07TZ48WdnZ2ZoxY4Y2bNigb775RiEhIRf+okhauHChjh8/roceekg2m01TpkxRz5499cMPP8jX11cPPfSQDh06pLVr1+qtt94q0zGAMrEAXLS5c+dakkot/v7+1rx580qN/89//uO2XlBQYLVo0cK6+eabXW3btm2zJFmPPvqo29h+/fpZkqxx48b9aU2vvPKKJcnasWOHW3t0dLTbcVq3bm3Fx8ef71T/1P/8z/9Ykqzc3Fy39lOnTlm//vqrazl27Jirb/r0
6ZYka8GCBa62goICy+FwWDVq1LCcTqdlWZa1dOlSS5L17LPPuu377rvvtmw2m7V//37Lsi7+dfvxxx8tSdbzzz9/zjHnW7Nlnd97bVmWJcny8/NzzcOyLGv79u2WJGvWrFl/WvO1115rBQcH/+mYPx6/Tp06VosWLaxTp0652lesWGFJssaOHetq69Spk9WpU6dS+xg4cKDVoEED13rJaxYWFmYdPXrU1f7RRx9Zkqzly5e72hISEiz+6UFl4zIWUI5SUlK0du1arV27VgsWLFCXLl00dOhQffDBB27jAgMDXX8+duyY8vLydOONN2rr1q2u9pUrV0qSHnvsMbdtn3jiifOqpWfPnvLx8dHixYtdbTt37tR3332n3r17u9pCQkK0a9cu7du377zneS5Op1OSVKNGDbf2l19+WbVr13Ytf7zUsnLlSkVERKhv376uNl9fXz322GM6ceKEPv/8c9c4b2/vUq/Hk08+KcuyXE+9Xezrdj7Ot2bp/N7rErGxsbr66qtd661atZLdbtcPP/zwp/U4nU4FBQWdV+1btmxRTk6OHn30UQUEBLja4+PjFRUVpY8//vi89nM2vXv3Vs2aNV3rN954oyT9Zf1ARSPsAOWoffv2io2NVWxsrPr376+PP/5Y0dHRSkxMVEFBgWvcihUr1LFjRwUEBCg0NFS1a9fWnDlzlJeX5xrz008/ycvLy+0fP0lq2rTpedVSq1Yt3XLLLXr33XddbYsXL5aPj4/b/TTJycnKzc1VkyZN1LJlS40cOVLffvttmeZf8g/uiRMn3Np79erlCoGtWrVy6/vpp5/UuHFjeXm5/3XUrFkzV3/Jf+vVq1fqH/WzjbuY1+18nG/N0vm91yUiIyNLtdWsWVPHjh3703rsdruOHz9+3rVLZ389oqKi3Gq/UP9df0nw+av6gYpG2AEqkJeXl7p06aLDhw+7zpx8+eWXuvPOOxUQEKDZs2dr5cqVWrt2rfr161fqRtuL1adPH+3du1fbtm2TJL377ru65ZZbVKtWLdeYm266Sd9//73eeOMNtWjRQq+99pr+9re/6bXXXrvg40VFRUn6/QzSH9WvX98VAv/4k7/pLvS99vb2Put+/ur/i6ioKOXl5engwYPlUneJc908fK7vSSpr/UBFI+wAFaywsFDS/zvb8f777ysgIECrV6/W3//+d3Xr1k2xsbGltmvQoIGKi4v1/fffu7VfyGPbPXr0kJ+fnxYvXqxt27Zp79696tOnT6lxoaGhGjx4sN555x0dPHhQrVq1KtP3oJQ8Xv/222+f9zYNGjTQvn37VFxc7Na+Z88eV3/Jfw8dOlTqDMbZxl3s61ZeNZ/ve32xunfvLklasGDBX44tqe1sr0dGRoarX/r9zExubm6pcRdz9oenr+AJhB2gAp05c0Zr1qyRn5+f6xKHt7e3bDab20/HBw4c0NKlS9227datmyRp5syZbu3Tp08/7+OHhIQoLi5O7777rhYtWiQ/Pz/16NHDbcyRI0fc1mvUqKFrrrlG+fn5rra8vDzt2bPnrJde/uj666/XrbfeqldffVUfffTRWcf890/5t99+u7KystzuLSosLNSsWbNUo0YNderUyTWuqKjI9ch8iWnTpslms7ler/J43f7K+dZ8vu/1xbr77rvVsmVLTZo0SWlpaaX6jx8/rmeeeUbS74/+16lTRy+//LLbe/zJJ59o9+7dbk+IXX311dqzZ49+/fVXV9v27du1YcOGMtdavXp1STpriAIqCo+eA+Xok08+cf10n5OTo4ULF2rfvn16+umnZbfbJf1+I+jUqVN12223qV+/fsrJyVFKSoquueYat3tl2rRpo759+2r27NnKy8vTddddp9TUVO3fv/+Caurdu7fuu+8+zZ49W3FxcaUeK46Ojlbnzp0VExOj0NBQbdmyRe+9954SExNdYz788EMNHjxYc+fO1aBBg/70eAsWLNBtt92mHj16uM5k1KxZ0/UNyl988YUrkEjS
gw8+qFdeeUWDBg1Senq6GjZsqPfee08bNmzQ9OnTXffodO/eXV26dNEzzzyjAwcOqHXr1lqzZo0++ugjPfHEE657dMrrdUtNTdXp06dLtffo0eO8az7f9/pi+fr66oMPPlBsbKxuuukm3Xvvvbr++uvl6+urXbt2aeHChapZs6YmTZokX19f/fOf/9TgwYPVqVMn9e3b1/XoecOGDTV8+HDXfv/+979r6tSpiouL05AhQ5STk6OXX35ZzZs3d92MfqFiYmIk/X4DeVxcnLy9vc96thEoV558FAwwxdkePQ8ICLDatGljzZkzxyouLnYb//rrr1uNGze2/P39raioKGvu3LnWuHHjSj2Se+rUKeuxxx6zwsLCrOrVq1vdu3e3Dh48eF6PUJdwOp1WYGBgqUelSzz77LNW+/btrZCQECswMNCKioqyJk2aZBUUFJSa39y5c8/rmKdOnbKmT59uORwOy263Wz4+PlZERIR1xx13WG+//bZVWFjoNj47O9saPHiwVatWLcvPz89q2bLlWY91/Phxa/jw4Va9evUsX19fq3Hjxtbzzz9f6vW9mNet5DHqcy1vvfXWBdV8vu+1JCshIaHU9g0aNLAGDhz4pzWXOHbsmDV27FirZcuWVrVq1ayAgACrRYsWVlJSknX48GG3sYsXL7auvfZay9/f3woNDbX69+9v/fzzz6X2uWDBAuuqq66y/Pz8rDZt2lirV68+56PnZ3tc/79f88LCQmvYsGFW7dq1LZvNxmPoqBQ2y+LOMQAAYC7u2QEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphR79/o6vT6eT3twAAYCDCjn7/KvXg4ODz/q3BAADg0kHYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNF8PF0AAFSW02eKlHn0P54uA7gsRIZWU4Cvt6fLkETYAXAZyTz6H439aKenywAuC8l3tVCT8CBPlyGJy1gAAMBwhB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDSPhp3x48fLZrO5LVFRUa7+06dPKyEhQWFhYapRo4Z69eql7Oxst31kZmYqPj5e1apVU506dTRy5EgVFhZW9lQAAEAV5ePpApo3b65169a51n18/l9Jw4cP18cff6wlS5YoODhYiYmJ6tmzpzZs2CBJKioqUnx8vCIiIrRx40YdPnxY999/v3x9ffV///d/lT4XAABQ9Xg87Pj4+CgiIqJUe15enl5//XUtXLhQN998syRp7ty5atasmTZt2qSOHTtqzZo1+u6777Ru3TqFh4erTZs2mjhxokaNGqXx48fLz8+vsqcDAACqGI/fs7Nv3z7Vq1dPV111lfr376/MzExJUnp6us6cOaPY2FjX2KioKEVGRiotLU2SlJaWppYtWyo8PNw1Ji4uTk6nU7t27TrnMfPz8+V0Ot0WAABgJo+GnQ4dOmjevHlatWqV5syZox9//FE33nijjh8/rqysLPn5+SkkJMRtm/DwcGVlZUmSsrKy3IJOSX9J37lMnjxZwcHBrqV+/frlOzEAAFBlePQyVrdu3Vx/btWqlTp06KAGDRro3XffVWBgYIUdNykpSSNGjHCtO51OAg8AAIby+GWsPwoJCVGTJk20f/9+RUREqKCgQLm5uW5jsrOzXff4
RERElHo6q2T9bPcBlfD395fdbndbAACAmapU2Dlx4oS+//571a1bVzExMfL19VVqaqqrPyMjQ5mZmXI4HJIkh8OhHTt2KCcnxzVm7dq1stvtio6OrvT6AQBA1ePRy1hPPfWUunfvrgYNGujQoUMaN26cvL291bdvXwUHB2vIkCEaMWKEQkNDZbfbNWzYMDkcDnXs2FGS1LVrV0VHR2vAgAGaMmWKsrKyNHr0aCUkJMjf39+TUwMAAFWER8POzz//rL59++rIkSOqXbu2brjhBm3atEm1a9eWJE2bNk1eXl7q1auX8vPzFRcXp9mzZ7u29/b21ooVK/TII4/I4XCoevXqGjhwoJKTkz01JQAAUMXYLMuyPF2EpzmdTgUHBysvL4/7dwCD7c0+rrEf7fR0GcBlIfmuFmoSHuTpMiRVsXt2AAAAyhthBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjFZlws5zzz0nm82mJ554wtV2+vRpJSQkKCwsTDVq1FCvXr2UnZ3ttl1mZqbi4+NVrVo11alTRyNHjlRhYWElVw8AAKqqKhF2Nm/erFdeeUWtWrVyax8+fLiWL1+uJUuW6PPPP9ehQ4fUs2dPV39RUZHi4+NVUFCgjRs3av78+Zo3b57Gjh1b2VMAAABVlMfDzokTJ9S/f3/961//Us2aNV3teXl5ev311zV16lTdfPPNiomJ0dy5c7Vx40Zt2rRJkrRmzRp99913WrBggdq0aaNu3bpp4sSJSklJUUFBgaemBAAAqhCPh52EhATFx8crNjbWrT09PV1nzpxxa4+KilJkZKTS0tIkSWlpaWrZsqXCw8NdY+Li4uR0OrVr165zHjM/P19Op9NtAQAAZvLx5MEXLVqkrVu3avPmzaX6srKy5Ofnp5CQELf28PBwZWVlucb8MeiU9Jf0ncvkyZM1YcKEi6weAABcCjx2ZufgwYN6/PHH9fbbbysgIKBSj52UlKS8vDzXcvDgwUo9PgAAqDweCzvp6enKycnR3/72N/n4+MjHx0eff/65Zs6cKR8fH4WHh6ugoEC5ublu22VnZysiIkKSFBERUerprJL1kjFn4+/vL7vd7rYAAAAzeSzs3HLLLdqxY4e2bdvmWtq2bav+/fu7/uzr66vU1FTXNhkZGcrMzJTD4ZAkORwO7dixQzk5Oa4xa9euld1uV3R0dKXPCQAAVD0eu2cnKChILVq0cGurXr26wsLCXO1DhgzRiBEjFBoaKrvdrmHDhsnhcKhjx46SpK5duyo6OloDBgzQlClTlJWVpdGjRyshIUH+/v6VPicAAFD1ePQG5b8ybdo0eXl5qVevXsrPz1dcXJxmz57t6vf29taKFSv0yCOPyOFwqHr16ho4cKCSk5M9WDUAAKhKbJZlWZ4uwtOcTqeCg4OVl5fH/TuAwfZmH9fYj3Z6ugzgspB8Vws1CQ/ydBmSqsD37AAAAFQkwg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBo
hB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAo5Up7Fx11VU6cuRIqfbc3FxdddVVF10UAABAeSlT2Dlw4ICKiopKtefn5+uXX3656KIAAADKi8+FDF62bJnrz6tXr1ZwcLBrvaioSKmpqWrYsGG5FQcAAHCxLijs9OjRQ5Jks9k0cOBAtz5fX181bNhQL774YrkVBwAAcLEuKOwUFxdLkho1aqTNmzerVq1aFVIUAABAebmgsFPixx9/LO86AAAAKkSZwo4kpaamKjU1VTk5Oa4zPiXeeOONiy4MAACgPJQp7EyYMEHJyclq27at6tatK5vNVt51AQAAlIsyhZ2XX35Z8+bN04ABA8q7HgAAgHJVpu/ZKSgo0HXXXVfetQAAAJS7MoWdoUOHauHCheVdCwAAQLkr02Ws06dP69VXX9W6devUqlUr+fr6uvVPnTq1XIoDAAC4WGUKO99++63atGkjSdq5c6dbHzcrAwCAqqRMl7E+++yzcy6ffvrpee9nzpw5atWqlex2u+x2uxwOhz755BNX/+nTp5WQkKCwsDDVqFFDvXr1UnZ2tts+MjMzFR8fr2rVqqlOnToaOXKkCgsLyzItAABgoDKFnfJy5ZVX6rnnnlN6erq2bNmim2++WXfddZd27dolSRo+fLiWL1+uJUuW6PPPP9ehQ4fUs2dP1/ZFRUWKj49XQUGBNm7cqPnz52vevHkaO3asp6YEAACqGJtlWdaFbtSlS5c/vVx1IWd3/ltoaKief/553X333apdu7YWLlyou+++W5K0Z88eNWvWTGlpaerYsaM++eQT3XHHHTp06JDCw8Ml/f5Y/KhRo/Trr7/Kz8/vvI7pdDoVHBysvLw82e32MtcOoGrbm31cYz/a+dcDAVy05LtaqEl4kKfLkFTGMztt2rRR69atXUt0dLQKCgq0detWtWzZskyFFBUVadGiRTp58qQcDofS09N15swZxcbGusZERUUpMjJSaWlpkqS0tDS1bNnSFXQkKS4uTk6n03V26Gzy8/PldDrdFgAAYKYy3aA8bdq0s7aPHz9eJ06cuKB97dixQw6HQ6dPn1aNGjX04YcfKjo6Wtu2bZOfn59CQkLcxoeHhysrK0uSlJWV5RZ0SvpL+s5l8uTJmjBhwgXVCQAALk3les/Offfdd8G/F6tp06batm2bvvrqKz3yyCMaOHCgvvvuu/Isq5SkpCTl5eW5loMHD1bo8QAAgOeU+ReBnk1aWpoCAgIuaBs/Pz9dc801kqSYmBht3rxZM2bMUO/evVVQUKDc3Fy3szvZ2dmKiIiQJEVEROjrr79221/J01olY87G399f/v7+F1QnAAC4NJUp7PzxiShJsixLhw8f1pYtWzRmzJiLKqi4uFj5+fmKiYmRr6+vUlNT1atXL0lSRkaGMjMz5XA4JEkOh0OTJk1STk6O6tSpI0lau3at7Ha7oqOjL6oOAABghjKFneDgYLd1Ly8vNW3aVMnJyeratet57ycpKUndunVTZGSkjh8/roULF2r9+vVavXq1goODNWTIEI0YMUKhoaGy2+0aNmyYHA6HOnbsKEnq2rWroqOjNWDAAE2ZMkVZWVkaPXq0EhISOHMDAAAklTHszJ07t1wOnpOTo/vvv1+HDx9WcHCwWrVqpdWrV+vWW2+V9PuN0F5eXurVq5fy8/MVFxen2bNnu7b39vbWihUr9Mgjj8jhcKh69eoaOHCgkpOTy6U+AABw6SvT9+yUSE9P1+7duyVJzZs317XXXltuhVUmvmcHuDzwPTtA5alK37NTpjM7OTk56tOnj9avX++6eTg3N1ddunTRokWLVLt27fKsEQAAoMzK9Oj5sGHDdPz4ce3atUtHjx7V0aNHtXPnTjmdTj322GPlXSMAAECZlenMzqpVq7Ru3To1a9bM1RYdHa2UlJQLukEZ
AACgopXpzE5xcbF8fX1Ltfv6+qq4uPiiiwIAACgvZQo7N998sx5//HEdOnTI1fbLL79o+PDhuuWWW8qtOAAAgItVprDz0ksvyel0qmHDhrr66qt19dVXq1GjRnI6nZo1a1Z51wgAAFBmZbpnp379+tq6davWrVunPXv2SJKaNWvm9hvKAQAAqoILOrPz6aefKjo6Wk6nUzabTbfeequGDRumYcOGqV27dmrevLm+/PLLiqoVAADggl1Q2Jk+fboeeOCBs37xXnBwsB566CFNnTq13IoDAAC4WBcUdrZv367bbrvtnP1du3ZVenr6RRcFAABQXi4o7GRnZ5/1kfMSPj4++vXXXy+6KAAAgPJyQWHniiuu0M6d5/69Mt9++63q1q170UUBAACUlwsKO7fffrvGjBmj06dPl+o7deqUxo0bpzvuuKPcigMAALhYF/To+ejRo/XBBx+oSZMmSkxMVNOmTSVJe/bsUUpKioqKivTMM89USKEAAABlcUFhJzw8XBs3btQjjzyipKQkWZYlSbLZbIqLi1NKSorCw8MrpFAAAICyuOAvFWzQoIFWrlypY8eOaf/+/bIsS40bN1bNmjUroj4AAICLUqZvUJakmjVrql27duVZCwAAQLkr0+/GAgAAuFQQdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0H08XcLk4faZImUf/4+kygMvGNbVryMvL5ukyAFQBhJ1Kknn0Pxr70U5PlwFcNuYOaq9AP29PlwGgCuAyFgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACM5tGwM3nyZLVr105BQUGqU6eOevTooYyMDLcxp0+fVkJCgsLCwlSjRg316tVL2dnZbmMyMzMVHx+vatWqqU6dOho5cqQKCwsrcyoAAKCK8mjY+fzzz5WQkKBNmzZp7dq1OnPmjLp27aqTJ0+6xgwfPlzLly/XkiVL9Pnnn+vQoUPq2bOnq7+oqEjx8fEqKCjQxo0bNX/+fM2bN09jx471xJQAAEAV49Hfer5q1Sq39Xnz5qlOnTpKT0/XTTfdpLy8PL3++utauHChbr75ZknS3Llz1axZM23atEkdO3bUmjVr9N1332ndunUKDw9XmzZtNHHiRI0aNUrjx4+Xn5+fJ6YGAACqiCp1z05eXp4kKTQ0VJKUnp6uM2fOKDY21jUmKipKkZGRSktLkySlpaWpZcuWCg8Pd42Ji4uT0+nUrl27KrF6AABQFXn0zM4fFRcX64knntD111+vFi1aSJKysrLk5+enkJAQt7Hh4eHKyspyjflj0CnpL+k7m/z8fOXn57vWnU5neU0DAABUMVXmzE5CQoJ27typRYsWVfixJk+erODgYNdSv379Cj8mAADwjCoRdhITE7VixQp99tlnuvLKK13tERERKigoUG5urtv47OxsRUREuMb899NZJeslY/5bUlKS8vLyXMvBgwfLcTYAAKAq8WjYsSxLiYmJ+vDDD/Xpp5+qUaNGbv0xMTHy9fVVamqqqy0jI0OZmZlyOBySJIfDoR07dignJ8c1Zu3atbLb7YqOjj7rcf39/WW3290WAABgJo/es5OQkKCFCxfqo48+UlBQkOsem+DgYAUGBio4OFhDhgzRiBEjFBoaKrvdrmHDhsnhcKhjx46SpK5duyo6OloDBgzQlClTlJWVpdGjRyshIUH+/v6enB4AAKgCPBp25syZI0nq3LmzW/vcuXM1aNAgSdK0adPk5eWlXr16KT8/X3FxcZo9e7ZrrLe3t1asWKFHHnlEDodD1atX18CBA5WcnFxZ0wAAAFWYR8OOZVl/OSYgIEApKSlKSUk555gGDRpo5cqV
5VkaAAAwRJW4QRkAAKCiEHYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBoHg07X3zxhbp376569erJZrNp6dKlbv2WZWns2LGqW7euAgMDFRsbq3379rmNOXr0qPr37y+73a6QkBANGTJEJ06cqMRZAACAqsyjYefkyZNq3bq1UlJSzto/ZcoUzZw5Uy+//LK++uorVa9eXXFxcTp9+rRrTP/+/bVr1y6tXbtWK1as0BdffKEHH3ywsqYAAACqOB9PHrxbt27q1q3bWfssy9L06dM1evRo3XXXXZKkN998U+Hh4Vq6dKn69Omj3bt3a9WqVdq8ebPatm0rSZo1a5Zuv/12vfDCC6pXr16lzQUAAFRNVfaenR9//FFZWVmKjY11tQUHB6tDhw5KS0uTJKWlpSkkJMQVdCQpNjZWXl5e+uqrryq9ZgAAUPV49MzOn8nKypIkhYeHu7WHh4e7+rKyslSnTh23fh8fH4WGhrrGnE1+fr7y8/Nd606ns7zKBgAAVUyVPbNTkSZPnqzg4GDXUr9+fU+XBAAAKkiVDTsRERGSpOzsbLf27OxsV19ERIRycnLc+gsLC3X06FHXmLNJSkpSXl6eazl48GA5Vw8AAKqKKht2GjVqpIiICKWmprranE6nvvrqKzkcDkmSw+FQbm6u0tPTXWM+/fRTFRcXq0OHDufct7+/v+x2u9sCAADM5NF7dk6cOKH9+/e71n/88Udt27ZNoaGhioyM1BNPPKFnn31WjRs3VqNGjTRmzBjVq1dPPXr0kCQ1a9ZMt912mx544AG9/PLLOnPmjBITE9WnTx+exAIAAJI8HHa2bNmiLl26uNZHjBghSRo4cKDmzZunf/zjHzp58qQefPBB5ebm6oYbbtCqVasUEBDg2ubtt99WYmKibrnlFnl5ealXr16aOXNmpc8FAABUTR4NO507d5ZlWefst9lsSk5OVnJy8jnHhIaGauHChRVRHgAAMECVvWcHAACgPBB2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBohB0AAGA0wg4AADAaYQcAABiNsAMAAIxG2AEAAEYj7AAAAKMRdgAAgNEIOwAAwGiEHQAAYDTCDgAAMBphBwAAGI2wAwAAjEbYAQAARiPsAAAAoxF2AACA0Qg7AADAaMaEnZSUFDVs2FABAQHq0KGDvv76a0+XBAAAqgAjws7ixYs1YsQIjRs3Tlu3blXr1q0VFxennJwcT5cGAAA8zGZZluXpIi5Whw4d1K5dO7300kuSpOLiYtWvX1/Dhg3T008//ZfbO51OBQcHKy8vT3a7vUJqPH2mSJlH/1Mh+wZQ2jW1a8jLy+bWxucQqDyRodUU4Ovt6TIkST6eLuBiFRQUKD09XUlJSa42Ly8vxcbGKi0tzYOVuQvw9VaT8CBPlwFc
1vgcApenSz7s/PbbbyoqKlJ4eLhbe3h4uPbs2XPWbfLz85Wfn+9az8vLk/T7GR4AAHBpCQoKks1mO2f/JR92ymLy5MmaMGFCqfb69et7oBoAAHAx/uo2lEs+7NSqVUve3t7Kzs52a8/OzlZERMRZt0lKStKIESNc68XFxTp69KjCwsL+NBni8uN0OlW/fn0dPHiwwu7nAnBufAZxPoKC/vzy9CUfdvz8/BQTE6PU1FT16NFD0u/hJTU1VYmJiWfdxt/fX/7+/m5tISEhFVwpLmV2u52/aAEP4jOIi3HJhx1JGjFihAYOHKi2bduqffv2mj59uk6ePKnBgwd7ujQAAOBhRoSd3r1769dff9XYsWOVlZWlNm3aaNWqVaVuWgYAAJcfI8KOJCUmJp7zshVQVv7+/ho3blypy54AKgefQZQHI75UEAAA4FyM+HURAAAA50LYAQAARiPsAAAAoxF2cNnp3LmznnjiiXLd5/r162Wz2ZSbm1uu+wVwcRo2bKjp06d7ugx4GGEHAAAYjbADAACMRtjBZamwsFCJiYkKDg5WrVq1NGbMGJV8C8Nbb72ltm3bKigoSBEREerXr59ycnLctl+5cqWaNGmiwMBAdenSRQcOHPDALIBLx/Hjx9W/f39Vr15ddevW1bRp09wuKR87dkz333+/atasqWrVqqlbt27at2+f2z7ef/99NW/eXP7+/mrYsKFefPFFt/6cnBx1795dgYGBatSokd5+++3Kmh6qOMIOLkvz58+Xj4+Pvv76a82YMUNTp07Va6+9Jkk6c+aMJk6cqO3bt2vp0qU6cOCABg0a5Nr24MGD6tmzp7p3765t27Zp6NChevrppz00E+DSMGLECG3YsEHLli3T2rVr9eWXX2rr1q2u/kGDBmnLli1atmyZ0tLSZFmWbr/9dp05c0aSlJ6ernvvvVd9+vTRjh07NH78eI0ZM0bz5s1z28fBgwf12Wef6b333tPs2bNL/aCCy5QFXGY6depkNWvWzCouLna1jRo1ymrWrNlZx2/evNmSZB0/ftyyLMtKSkqyoqOj3caMGjXKkmQdO3aswuoGLlVOp9Py9fW1lixZ4mrLzc21qlWrZj3++OPW3r17LUnWhg0bXP2//fabFRgYaL377ruWZVlWv379rFtvvdVtvyNHjnR9FjMyMixJ1tdff+3q3717tyXJmjZtWgXODpcCzuzgstSxY0fZbDbXusPh0L59+1RUVKT09HR1795dkZGRCgoKUqdOnSRJmZmZkqTdu3erQ4cObvtzOByVVzxwifnhhx905swZtW/f3tUWHByspk2bSvr9M+Xj4+P2uQoLC1PTpk21e/du15jrr7/ebb/XX3+963Nbso+YmBhXf1RUlEJCQipwZrhUEHaAPzh9+rTi4uJkt9v19ttva/Pmzfrwww8lSQUFBR6uDgBQFoQdXJa++uort/VNmzapcePG2rNnj44cOaLnnntON954o6Kiokpd82/WrJm+/vrrUtsDOLurrrpKvr6+2rx5s6stLy9Pe/fulfT7Z6qwsNDtc3nkyBFlZGQoOjraNWbDhg1u+92wYYOaNGkib29vRUVFqbCwUOnp6a7+jIwMvvsKkgg7uExlZmZqxIgRysjI0DvvvKNZs2bp8ccfV2RkpPz8/DRr1iz98MMPWrZsmSZOnOi27cMPP6x9+/Zp5MiRysjI0MKFC91ukgTgLigoSAMHDtTIkSP12WefadeuXRoyZIi8vLxks9nUuHFj3XXXXXrggQf073//W9u3b9d9992nK664QnfddZck6cknn1RqaqomTpyovXv3av78+XrppZf01FNPSZKaNm2q2267TQ899JC++uorpaena+jQoQoMDPTk1FFVePqmIaCyderUyXr00Uethx9+2LLb7VbNmjWt//3f/3XdsLxw4UKrYcOGlr+/v+VwOKxly5ZZkqxvvvnGtY/ly5db11xzjeXv72/deOON1htvvMENysCfcDqdVr9+/axq
1apZERER1tSpU6327dtbTz/9tGVZlnX06FFrwIABVnBwsBUYGGjFxcVZe/fuddvHe++9Z0VHR1u+vr5WZGSk9fzzz7v1Hz582IqPj7f8/f2tyMhI680337QaNGjADcqwbJb1/3+5CAAAleTkyZO64oor9OKLL2rIkCGeLgeG8/F0AQAA833zzTfas2eP2rdvr7y8PCUnJ0uS6zIVUJEIOwCASvHCCy8oIyNDfn5+iomJ0ZdffqlatWp5uixcBriMBQAAjMbTWAAAwGiEHQAAYDTCDgAAMBphBwAAGI2wA+CSdeDAAdlsNm3bts3TpQCowgg7AADAaIQdAABgNMIOgCqvuLhYU6ZM0TXXXCN/f39FRkZq0qRJpcYVFRVpyJAhatSokQIDA9W0aVPNmDHDbcz69evVvn17Va9eXSEhIbr++uv1008/SZK2b9+uLl26KCgoSHa7XTExMdqyZUulzBFAxeEblAFUeUlJSfrXv/6ladOm6YYbbtDhw4e1Z8+eUuOKi4t15ZVXasmSJQoLC9PGjRv14IMPqm7durr33ntVWFioHj166IEHHtA777yjgoICff3117LZbJKk/v3769prr9WcOXPk7e2tbdu2ydfXt7KnC6Cc8Q3KAKq048ePq3bt2nrppZc0dOhQt74DBw6oUaNG+uabb9SmTZuzbp+YmKisrCy99957Onr0qMLCwrR+/Xp16tSp1Fi73a5Zs2Zp4MCBFTEVAB7CZSwAVdru3buVn5+vW2655bzGp6SkKCYmRrVr11aNGjX06quvKjMzU5IUGhqqQYMGKS4uTt27d9eMGTN0+PBh17YjRozQ0KFDFRsbq+eee07ff/99hcwJQOUi7ACo0gIDA8977KJFi/TUU09pyJAhWrNmjbZt26bBgweroKDANWbu3LlKS0vTddddp8WLF6tJkybatGmTJGn8+PHatWuX4uPj9emnnyo6Oloffvhhuc8JQOXiMhaAKu306dMKDQ3VzJkz//Iy1rBhw/Tdd98pNTXVNSY2Nla//fbbOb+Lx+FwqF27dpo5c2apvr59++rkyZNatmxZuc4JQOXizA6AKi0gIECjRo3SP/7xD7355pv6/vvvtWnTJr3++uulxjZu3FhbtmzR6tWrtXfvXo0ZM0abN2929f/4449KSkpSWlqafvrpJ61Zs0b79u1Ts2bNdOrUKSUmJmr9+vX66aeftGHDBm3evFnNmjWrzOkCqAA8jQWgyhszZox8fHw0duxYHTp0SHXr1tXDDz9catxDDz2kb775Rr1795bNZlPfvn316KOP6pNPPpEkVatWTXv27NH8+fN15MgR1a1bVwkJCXrooYdUWFioI0eO6P7771d2drZq1aqlnj17asKECZU9XQDljMtYAADAaFzGAgAARiPsAAAAoxF2AACA0Qg7AADAaIQdAABgNMIOAAAwGmEHAAAYjbADAACMRtgBAABGI+wAAACjEXYAAIDRCDsAAMBo/x9dWkm/NZ32CAAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 640x480 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# plot the target variable \"class\"\n",
    "p = sns.histplot(train[\"class\"], ec=\"w\", lw=4)\n",
    "_ = p.set_title(\"Bad vs. Good Loan Count\")\n",
    "_ = p.spines[\"top\"].set_visible(False)\n",
    "_ = p.spines[\"right\"].set_visible(False)"
   ]
  },
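  {
   "cell_type": "markdown",
   "id": "class-share-sketch",
   "metadata": {},
   "source": [
    "The imbalance shown in the plot can also be expressed as proportions, which are easier to compare across datasets. The sketch below uses a hypothetical toy series (`toy_classes`) so that it runs on its own; in this notebook, the same call could be applied to `train[\"class\"]`.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy labels: 7 good loans and 3 bad loans\n",
    "toy_classes = pd.Series([\"good\"] * 7 + [\"bad\"] * 3, name=\"class\")\n",
    "\n",
    "# normalize=True returns the share of each class instead of raw counts\n",
    "shares = toy_classes.value_counts(normalize=True)\n",
    "print(shares)\n",
    "```\n",
    "\n",
    "A large gap between the shares is a signal to look beyond plain accuracy when evaluating the model, for example at per-class precision and recall."
   ]
  },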
  {
   "cell_type": "markdown",
   "id": "c6a697a5-5709-4a69-b644-62779b4f8bc5",
   "metadata": {},
   "source": [
    "Now, view the first few records of the context data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "79424785-129d-4007-84a5-041b6d38457d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>class</th>\n",
       "      <th>outcome_timestamp</th>\n",
       "      <th>duration</th>\n",
       "      <th>credit_amount</th>\n",
       "      <th>installment_commitment</th>\n",
       "      <th>checking_status</th>\n",
       "      <th>residence_since</th>\n",
       "      <th>age</th>\n",
       "      <th>existing_credits</th>\n",
       "      <th>num_dependents</th>\n",
       "      <th>housing</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>473</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-12-16 03:29:12+00:00</td>\n",
       "      <td>6</td>\n",
       "      <td>1238</td>\n",
       "      <td>4</td>\n",
       "      <td>no checking</td>\n",
       "      <td>4</td>\n",
       "      <td>36</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>own</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>764</th>\n",
       "      <td>894</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-11-15 23:19:35+00:00</td>\n",
       "      <td>18</td>\n",
       "      <td>1169</td>\n",
       "      <td>4</td>\n",
       "      <td>no checking</td>\n",
       "      <td>3</td>\n",
       "      <td>29</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>own</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>504</th>\n",
       "      <td>318</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-11-23 13:03:53+00:00</td>\n",
       "      <td>12</td>\n",
       "      <td>701</td>\n",
       "      <td>4</td>\n",
       "      <td>no checking</td>\n",
       "      <td>2</td>\n",
       "      <td>32</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>own</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>454</th>\n",
       "      <td>340</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-12-26 17:59:37+00:00</td>\n",
       "      <td>24</td>\n",
       "      <td>5743</td>\n",
       "      <td>2</td>\n",
       "      <td>0&lt;=X&lt;200</td>\n",
       "      <td>4</td>\n",
       "      <td>24</td>\n",
       "      <td>2</td>\n",
       "      <td>1</td>\n",
       "      <td>for free</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>453</th>\n",
       "      <td>605</td>\n",
       "      <td>good</td>\n",
       "      <td>2023-12-18 11:27:02+00:00</td>\n",
       "      <td>24</td>\n",
       "      <td>2828</td>\n",
       "      <td>4</td>\n",
       "      <td>&lt;0</td>\n",
       "      <td>4</td>\n",
       "      <td>22</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>own</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      ID class         outcome_timestamp  duration  credit_amount  \\\n",
       "18   473  good 2023-12-16 03:29:12+00:00         6           1238   \n",
       "764  894  good 2023-11-15 23:19:35+00:00        18           1169   \n",
       "504  318  good 2023-11-23 13:03:53+00:00        12            701   \n",
       "454  340  good 2023-12-26 17:59:37+00:00        24           5743   \n",
       "453  605  good 2023-12-18 11:27:02+00:00        24           2828   \n",
       "\n",
       "     installment_commitment checking_status  residence_since  age  \\\n",
       "18                        4     no checking                4   36   \n",
       "764                       4     no checking                3   29   \n",
       "504                       4     no checking                2   32   \n",
       "454                       2        0<=X<200                4   24   \n",
       "453                       4              <0                4   22   \n",
       "\n",
       "     existing_credits  num_dependents   housing  \n",
       "18                  1               2       own  \n",
       "764                 2               1       own  \n",
       "504                 2               1       own  \n",
       "454                 2               1  for free  \n",
       "453                 1               1       own  "
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# View first records in training data\n",
    "train.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd52f5bc-aa0f-48db-b356-c52aa7ce3724",
   "metadata": {},
   "source": [
    "### Feature Engineering"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e5b5c02-ad4d-400e-bdac-bfdf2799f575",
   "metadata": {},
   "source": [
    "Once data columns have been prepared so that they can be used to train an AI model, it is common to refer to them as \"features\". The process of preparing features is referred to as \"feature engineering\". \n",
    "\n",
    "Below, we will train a random forest model. Random forests are relatively robust to non-standardized, non-normalized data, making it easier for us to getting started. As such, the numerical columns are ready for a simple baseline training. \n",
    "\n",
    "We have pulled two categorical columns, wich we will need to engineer into numerical features."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45a6fb27-140c-4f5a-b464-1f5e5d81d086",
   "metadata": {},
   "source": [
    "The `checking_status` column tells us roughly how much money the applicant has in their checking account, while the `housing` column shows the applicant's housing status. We presume that more money in checking correlates inversely with credit risk, while owing vs. renting, vs. living for free correlates directly with credit risk. Hence, converting these to ordinal features makes sense. Of course, in a real study we would want to quantitatively verify these presumptions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "9e374096-b02d-4cbb-8fca-dcc451c90c50",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "checking_status\n",
       "no checking    0.39375\n",
       "0<=X<200       0.27500\n",
       "<0             0.26125\n",
       ">=200          0.07000\n",
       "Name: proportion, dtype: float64"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Inspect the `checking_status` column distibution\n",
    "train.checking_status.value_counts(normalize=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "0144b525-244b-4526-8e4b-d393cb174d06",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "housing\n",
       "own         0.7225\n",
       "rent        0.1675\n",
       "for free    0.1100\n",
       "Name: proportion, dtype: float64"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Inspect the `housing` column distribution\n",
    "train.housing.value_counts(normalize=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2cb340b4-7d21-4810-8be2-1633da2e4396",
   "metadata": {},
   "source": [
    "We define a tranformer that can be used to convert `checking_status` and `housing` to ordinal variables. The transformer will also drop the non-feature columns (`class`, `ID`, and `application_timestamp`) from the feature data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "27796e23-c12e-4e51-8fb4-090b26aff2ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Feature lists\n",
    "cat_features = [\"checking_status\", \"housing\"]\n",
    "num_features = [\n",
    "    \"duration\", \"credit_amount\", \"installment_commitment\",\n",
    "    \"residence_since\", \"age\", \"existing_credits\", \"num_dependents\"\n",
    "]\n",
    "\n",
    "# Ordinal encoder for cat_features\n",
    "# (We use a ColumnTransformer to passthrough numerical feature columns)\n",
    "col_transform = ColumnTransformer([\n",
    "        (\"cat_features\", OrdinalEncoder(), cat_features),\n",
    "        (\"num_features\", \"passthrough\", num_features),\n",
    "    ],\n",
    "    remainder=\"drop\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "318429b9-e008-4cc7-8108-779934f9ac2f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>checking_status</th>\n",
       "      <th>housing</th>\n",
       "      <th>duration</th>\n",
       "      <th>credit_amount</th>\n",
       "      <th>installment_commitment</th>\n",
       "      <th>residence_since</th>\n",
       "      <th>age</th>\n",
       "      <th>existing_credits</th>\n",
       "      <th>num_dependents</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>1238.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>36.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>764</th>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>18.0</td>\n",
       "      <td>1169.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>29.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>504</th>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>12.0</td>\n",
       "      <td>701.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>32.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>454</th>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>5743.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>453</th>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>2828.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>22.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     checking_status  housing  duration  credit_amount  \\\n",
       "18               3.0      1.0       6.0         1238.0   \n",
       "764              3.0      1.0      18.0         1169.0   \n",
       "504              3.0      1.0      12.0          701.0   \n",
       "454              0.0      0.0      24.0         5743.0   \n",
       "453              1.0      1.0      24.0         2828.0   \n",
       "\n",
       "     installment_commitment  residence_since   age  existing_credits  \\\n",
       "18                      4.0              4.0  36.0               1.0   \n",
       "764                     4.0              3.0  29.0               2.0   \n",
       "504                     4.0              2.0  32.0               2.0   \n",
       "454                     2.0              4.0  24.0               2.0   \n",
       "453                     4.0              4.0  22.0               1.0   \n",
       "\n",
       "     num_dependents  \n",
       "18              2.0  \n",
       "764             1.0  \n",
       "504             1.0  \n",
       "454             1.0  \n",
       "453             1.0  "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Check the tranform outputs features as expected\n",
    "# (Note: transform output is an array, so we convert it\n",
    "# back to dataframe for inspection)\n",
    "pd.DataFrame(\n",
    "    index=train.index,\n",
    "    columns=cat_features + num_features,\n",
    "    data= col_transform.fit_transform(train)\n",
    ").head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a3785c93-8830-4fa2-bb9d-31b6e8fecb01",
   "metadata": {},
   "source": [
    "Finally, let's separate out the labels, and engineer them from categorical (\"good\" | \"bad\") to float (1.0 | 0.0). We do this for both the training and validation data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "30ebff90-a193-43a2-86fb-cf09e7d03777",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Make \"class\" target variable numeric\n",
    "train_y = (train[\"class\"] == \"good\").astype(float)\n",
    "validate_y = (validate[\"class\"] == \"good\").astype(float)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b052f6b2-2a34-441d-8a5f-2aad4e4db022",
   "metadata": {},
   "source": [
    "### Train the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4f14590-31f4-4680-b1a1-75755a78513e",
   "metadata": {},
   "source": [
    "Now that the features are prepared, we can train (fit) our baseline model on the feature data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "0ff48f34-dbb6-4221-aefc-3c9b3f9da3e3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Specify the model\n",
    "rf_model = RandomForestClassifier(\n",
    "    n_estimators=400,\n",
    "    criterion=\"entropy\",\n",
    "    max_depth=4,\n",
    "    min_samples_leaf=10,\n",
    "    class_weight={0:5, 1:1},\n",
    "    random_state=SEED\n",
    ")\n",
    "\n",
    "# Package transform and model in pipeline\n",
    "model = Pipeline([(\"transform\", col_transform), (\"rf_model\", rf_model)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "1d6ef38a-23b0-4056-a108-960495521164",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>#sk-container-id-1 {\n",
       "  /* Definition of color scheme common for light and dark mode */\n",
       "  --sklearn-color-text: black;\n",
       "  --sklearn-color-line: gray;\n",
       "  /* Definition of color scheme for unfitted estimators */\n",
       "  --sklearn-color-unfitted-level-0: #fff5e6;\n",
       "  --sklearn-color-unfitted-level-1: #f6e4d2;\n",
       "  --sklearn-color-unfitted-level-2: #ffe0b3;\n",
       "  --sklearn-color-unfitted-level-3: chocolate;\n",
       "  /* Definition of color scheme for fitted estimators */\n",
       "  --sklearn-color-fitted-level-0: #f0f8ff;\n",
       "  --sklearn-color-fitted-level-1: #d4ebff;\n",
       "  --sklearn-color-fitted-level-2: #b3dbfd;\n",
       "  --sklearn-color-fitted-level-3: cornflowerblue;\n",
       "\n",
       "  /* Specific color for light theme */\n",
       "  --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
       "  --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
       "  --sklearn-color-icon: #696969;\n",
       "\n",
       "  @media (prefers-color-scheme: dark) {\n",
       "    /* Redefinition of color scheme for dark theme */\n",
       "    --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
       "    --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
       "    --sklearn-color-icon: #878787;\n",
       "  }\n",
       "}\n",
       "\n",
       "#sk-container-id-1 {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 pre {\n",
       "  padding: 0;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-hidden--visually {\n",
       "  border: 0;\n",
       "  clip: rect(1px 1px 1px 1px);\n",
       "  clip: rect(1px, 1px, 1px, 1px);\n",
       "  height: 1px;\n",
       "  margin: -1px;\n",
       "  overflow: hidden;\n",
       "  padding: 0;\n",
       "  position: absolute;\n",
       "  width: 1px;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-dashed-wrapped {\n",
       "  border: 1px dashed var(--sklearn-color-line);\n",
       "  margin: 0 0.4em 0.5em 0.4em;\n",
       "  box-sizing: border-box;\n",
       "  padding-bottom: 0.4em;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-container {\n",
       "  /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
       "     but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
       "     so we also need the `!important` here to be able to override the\n",
       "     default hidden behavior on the sphinx rendered scikit-learn.org.\n",
       "     See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
       "  display: inline-block !important;\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-text-repr-fallback {\n",
       "  display: none;\n",
       "}\n",
       "\n",
       "div.sk-parallel-item,\n",
       "div.sk-serial,\n",
       "div.sk-item {\n",
       "  /* draw centered vertical line to link estimators */\n",
       "  background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
       "  background-size: 2px 100%;\n",
       "  background-repeat: no-repeat;\n",
       "  background-position: center center;\n",
       "}\n",
       "\n",
       "/* Parallel-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item::after {\n",
       "  content: \"\";\n",
       "  width: 100%;\n",
       "  border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
       "  flex-grow: 1;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel {\n",
       "  display: flex;\n",
       "  align-items: stretch;\n",
       "  justify-content: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  position: relative;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:first-child::after {\n",
       "  align-self: flex-end;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:last-child::after {\n",
       "  align-self: flex-start;\n",
       "  width: 50%;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-parallel-item:only-child::after {\n",
       "  width: 0;\n",
       "}\n",
       "\n",
       "/* Serial-specific style estimator block */\n",
       "\n",
       "#sk-container-id-1 div.sk-serial {\n",
       "  display: flex;\n",
       "  flex-direction: column;\n",
       "  align-items: center;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  padding-right: 1em;\n",
       "  padding-left: 1em;\n",
       "}\n",
       "\n",
       "\n",
       "/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
       "clickable and can be expanded/collapsed.\n",
       "- Pipeline and ColumnTransformer use this feature and define the default style\n",
       "- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
       "*/\n",
       "\n",
       "/* Pipeline and ColumnTransformer style (default) */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable {\n",
       "  /* Default theme specific background. It is overwritten whether we have a\n",
       "  specific estimator or a Pipeline/ColumnTransformer */\n",
       "  background-color: var(--sklearn-color-background);\n",
       "}\n",
       "\n",
       "/* Toggleable label */\n",
       "#sk-container-id-1 label.sk-toggleable__label {\n",
       "  cursor: pointer;\n",
       "  display: block;\n",
       "  width: 100%;\n",
       "  margin-bottom: 0;\n",
       "  padding: 0.5em;\n",
       "  box-sizing: border-box;\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n",
       "  /* Arrow on the left of the label */\n",
       "  content: \"▸\";\n",
       "  float: left;\n",
       "  margin-right: 0.25em;\n",
       "  color: var(--sklearn-color-icon);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n",
       "  color: var(--sklearn-color-text);\n",
       "}\n",
       "\n",
       "/* Toggleable content - dropdown */\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content {\n",
       "  max-height: 0;\n",
       "  max-width: 0;\n",
       "  overflow: hidden;\n",
       "  text-align: left;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content pre {\n",
       "  margin: 0.2em;\n",
       "  border-radius: 0.25em;\n",
       "  color: var(--sklearn-color-text);\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
       "  /* Expand drop-down */\n",
       "  max-height: 200px;\n",
       "  max-width: 100%;\n",
       "  overflow: auto;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
       "  content: \"▾\";\n",
       "}\n",
       "\n",
       "/* Pipeline/ColumnTransformer-specific style */\n",
       "\n",
       "#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator-specific style */\n",
       "\n",
       "/* Colorize estimator box */\n",
       "#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  /* The background is the default theme color */\n",
       "  color: var(--sklearn-color-text-on-default-background);\n",
       "}\n",
       "\n",
       "/* On hover, darken the color of the background */\n",
       "#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "/* Label box, darken color on hover, fitted */\n",
       "#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
       "  color: var(--sklearn-color-text);\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Estimator label */\n",
       "\n",
       "#sk-container-id-1 div.sk-label label {\n",
       "  font-family: monospace;\n",
       "  font-weight: bold;\n",
       "  display: inline-block;\n",
       "  line-height: 1.2em;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-label-container {\n",
       "  text-align: center;\n",
       "}\n",
       "\n",
       "/* Estimator-specific */\n",
       "#sk-container-id-1 div.sk-estimator {\n",
       "  font-family: monospace;\n",
       "  border: 1px dotted var(--sklearn-color-border-box);\n",
       "  border-radius: 0.25em;\n",
       "  box-sizing: border-box;\n",
       "  margin-bottom: 0.5em;\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-0);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-0);\n",
       "}\n",
       "\n",
       "/* on hover */\n",
       "#sk-container-id-1 div.sk-estimator:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-2);\n",
       "}\n",
       "\n",
       "#sk-container-id-1 div.sk-estimator.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-2);\n",
       "}\n",
       "\n",
       "/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
       "\n",
       "/* Common style for \"i\" and \"?\" */\n",
       "\n",
       ".sk-estimator-doc-link,\n",
       "a:link.sk-estimator-doc-link,\n",
       "a:visited.sk-estimator-doc-link {\n",
       "  float: right;\n",
       "  font-size: smaller;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1em;\n",
       "  height: 1em;\n",
       "  width: 1em;\n",
       "  text-decoration: none !important;\n",
       "  margin-left: 1ex;\n",
       "  /* unfitted */\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted,\n",
       "a:link.sk-estimator-doc-link.fitted,\n",
       "a:visited.sk-estimator-doc-link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
       ".sk-estimator-doc-link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover,\n",
       "div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
       ".sk-estimator-doc-link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "/* Span, style for the box shown on hovering the info icon */\n",
       ".sk-estimator-doc-link span {\n",
       "  display: none;\n",
       "  z-index: 9999;\n",
       "  position: relative;\n",
       "  font-weight: normal;\n",
       "  right: .2ex;\n",
       "  padding: .5ex;\n",
       "  margin: .5ex;\n",
       "  width: min-content;\n",
       "  min-width: 20ex;\n",
       "  max-width: 50ex;\n",
       "  color: var(--sklearn-color-text);\n",
       "  box-shadow: 2pt 2pt 4pt #999;\n",
       "  /* unfitted */\n",
       "  background: var(--sklearn-color-unfitted-level-0);\n",
       "  border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link.fitted span {\n",
       "  /* fitted */\n",
       "  background: var(--sklearn-color-fitted-level-0);\n",
       "  border: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "\n",
       ".sk-estimator-doc-link:hover span {\n",
       "  display: block;\n",
       "}\n",
       "\n",
       "/* \"?\"-specific style due to the `<a>` HTML tag */\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link {\n",
       "  float: right;\n",
       "  font-size: 1rem;\n",
       "  line-height: 1em;\n",
       "  font-family: monospace;\n",
       "  background-color: var(--sklearn-color-background);\n",
       "  border-radius: 1rem;\n",
       "  height: 1rem;\n",
       "  width: 1rem;\n",
       "  text-decoration: none;\n",
       "  /* unfitted */\n",
       "  color: var(--sklearn-color-unfitted-level-1);\n",
       "  border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted {\n",
       "  /* fitted */\n",
       "  border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
       "  color: var(--sklearn-color-fitted-level-1);\n",
       "}\n",
       "\n",
       "/* On hover */\n",
       "#sk-container-id-1 a.estimator_doc_link:hover {\n",
       "  /* unfitted */\n",
       "  background-color: var(--sklearn-color-unfitted-level-3);\n",
       "  color: var(--sklearn-color-background);\n",
       "  text-decoration: none;\n",
       "}\n",
       "\n",
       "#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n",
       "  /* fitted */\n",
       "  background-color: var(--sklearn-color-fitted-level-3);\n",
       "}\n",
       "</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>Pipeline(steps=[(&#x27;transform&#x27;,\n",
       "                 ColumnTransformer(transformers=[(&#x27;cat_features&#x27;,\n",
       "                                                  OrdinalEncoder(),\n",
       "                                                  [&#x27;checking_status&#x27;,\n",
       "                                                   &#x27;housing&#x27;]),\n",
       "                                                 (&#x27;num_features&#x27;, &#x27;passthrough&#x27;,\n",
       "                                                  [&#x27;duration&#x27;, &#x27;credit_amount&#x27;,\n",
       "                                                   &#x27;installment_commitment&#x27;,\n",
       "                                                   &#x27;residence_since&#x27;, &#x27;age&#x27;,\n",
       "                                                   &#x27;existing_credits&#x27;,\n",
       "                                                   &#x27;num_dependents&#x27;])])),\n",
       "                (&#x27;rf_model&#x27;,\n",
       "                 RandomForestClassifier(class_weight={0: 5, 1: 1},\n",
       "                                        criterion=&#x27;entropy&#x27;, max_depth=4,\n",
       "                                        min_samples_leaf=10, n_estimators=400,\n",
       "                                        random_state=142))])</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" ><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;&nbsp;Pipeline<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.5/modules/generated/sklearn.pipeline.Pipeline.html\">?<span>Documentation for Pipeline</span></a><span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></label><div class=\"sk-toggleable__content fitted\"><pre>Pipeline(steps=[(&#x27;transform&#x27;,\n",
       "                 ColumnTransformer(transformers=[(&#x27;cat_features&#x27;,\n",
       "                                                  OrdinalEncoder(),\n",
       "                                                  [&#x27;checking_status&#x27;,\n",
       "                                                   &#x27;housing&#x27;]),\n",
       "                                                 (&#x27;num_features&#x27;, &#x27;passthrough&#x27;,\n",
       "                                                  [&#x27;duration&#x27;, &#x27;credit_amount&#x27;,\n",
       "                                                   &#x27;installment_commitment&#x27;,\n",
       "                                                   &#x27;residence_since&#x27;, &#x27;age&#x27;,\n",
       "                                                   &#x27;existing_credits&#x27;,\n",
       "                                                   &#x27;num_dependents&#x27;])])),\n",
       "                (&#x27;rf_model&#x27;,\n",
       "                 RandomForestClassifier(class_weight={0: 5, 1: 1},\n",
       "                                        criterion=&#x27;entropy&#x27;, max_depth=4,\n",
       "                                        min_samples_leaf=10, n_estimators=400,\n",
       "                                        random_state=142))])</pre></div> </div></div><div class=\"sk-serial\"><div class=\"sk-item sk-dashed-wrapped\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-2\" type=\"checkbox\" ><label for=\"sk-estimator-id-2\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;transform: ColumnTransformer<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.5/modules/generated/sklearn.compose.ColumnTransformer.html\">?<span>Documentation for transform: ColumnTransformer</span></a></label><div class=\"sk-toggleable__content fitted\"><pre>ColumnTransformer(transformers=[(&#x27;cat_features&#x27;, OrdinalEncoder(),\n",
       "                                 [&#x27;checking_status&#x27;, &#x27;housing&#x27;]),\n",
       "                                (&#x27;num_features&#x27;, &#x27;passthrough&#x27;,\n",
       "                                 [&#x27;duration&#x27;, &#x27;credit_amount&#x27;,\n",
       "                                  &#x27;installment_commitment&#x27;, &#x27;residence_since&#x27;,\n",
       "                                  &#x27;age&#x27;, &#x27;existing_credits&#x27;,\n",
       "                                  &#x27;num_dependents&#x27;])])</pre></div> </div></div><div class=\"sk-parallel\"><div class=\"sk-parallel-item\"><div class=\"sk-item\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-3\" type=\"checkbox\" ><label for=\"sk-estimator-id-3\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">cat_features</label><div class=\"sk-toggleable__content fitted\"><pre>[&#x27;checking_status&#x27;, &#x27;housing&#x27;]</pre></div> </div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-4\" type=\"checkbox\" ><label for=\"sk-estimator-id-4\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;OrdinalEncoder<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.5/modules/generated/sklearn.preprocessing.OrdinalEncoder.html\">?<span>Documentation for OrdinalEncoder</span></a></label><div class=\"sk-toggleable__content fitted\"><pre>OrdinalEncoder()</pre></div> </div></div></div></div></div><div class=\"sk-parallel-item\"><div class=\"sk-item\"><div class=\"sk-label-container\"><div class=\"sk-label fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-5\" type=\"checkbox\" ><label for=\"sk-estimator-id-5\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">num_features</label><div class=\"sk-toggleable__content fitted\"><pre>[&#x27;duration&#x27;, &#x27;credit_amount&#x27;, &#x27;installment_commitment&#x27;, &#x27;residence_since&#x27;, &#x27;age&#x27;, &#x27;existing_credits&#x27;, &#x27;num_dependents&#x27;]</pre></div> </div></div><div class=\"sk-serial\"><div class=\"sk-item\"><div class=\"sk-estimator fitted 
sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-6\" type=\"checkbox\" ><label for=\"sk-estimator-id-6\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">passthrough</label><div class=\"sk-toggleable__content fitted\"><pre>passthrough</pre></div> </div></div></div></div></div></div></div><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-7\" type=\"checkbox\" ><label for=\"sk-estimator-id-7\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;RandomForestClassifier<a class=\"sk-estimator-doc-link fitted\" rel=\"noreferrer\" target=\"_blank\" href=\"https://scikit-learn.org/1.5/modules/generated/sklearn.ensemble.RandomForestClassifier.html\">?<span>Documentation for RandomForestClassifier</span></a></label><div class=\"sk-toggleable__content fitted\"><pre>RandomForestClassifier(class_weight={0: 5, 1: 1}, criterion=&#x27;entropy&#x27;,\n",
       "                       max_depth=4, min_samples_leaf=10, n_estimators=400,\n",
       "                       random_state=142)</pre></div> </div></div></div></div></div></div>"
      ],
      "text/plain": [
       "Pipeline(steps=[('transform',\n",
       "                 ColumnTransformer(transformers=[('cat_features',\n",
       "                                                  OrdinalEncoder(),\n",
       "                                                  ['checking_status',\n",
       "                                                   'housing']),\n",
       "                                                 ('num_features', 'passthrough',\n",
       "                                                  ['duration', 'credit_amount',\n",
       "                                                   'installment_commitment',\n",
       "                                                   'residence_since', 'age',\n",
       "                                                   'existing_credits',\n",
       "                                                   'num_dependents'])])),\n",
       "                ('rf_model',\n",
       "                 RandomForestClassifier(class_weight={0: 5, 1: 1},\n",
       "                                        criterion='entropy', max_depth=4,\n",
       "                                        min_samples_leaf=10, n_estimators=400,\n",
       "                                        random_state=142))])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Fit the model\n",
    "model.fit(train, train_y)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73c45c39-9d8e-4f76-aca5-9f0c1568d263",
   "metadata": {},
   "source": [
    "### Evaluate the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef58d432-80ba-428f-b59f-621a9e53b331",
   "metadata": {},
   "source": [
    "Let's evaluate the performance of our baseline model. In credit risk, recall (the share of actual loans in a class that the model identifies) is an important measure to look at. We compare performance on the training data with performance on the validation data using a classification report."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "8c5472f6-2ddc-437d-8102-4d5bd2c9f39c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              precision    recall  f1-score   support\n",
      "\n",
      "         0.0       0.42      0.92      0.58       232\n",
      "         1.0       0.94      0.49      0.64       568\n",
      "\n",
      "    accuracy                           0.61       800\n",
      "   macro avg       0.68      0.70      0.61       800\n",
      "weighted avg       0.79      0.61      0.63       800\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Evaluate training set performance\n",
    "train_preds = model.predict(train)\n",
    "print(classification_report(train_y, train_preds))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "c296bbd3-603e-4615-abbe-2689ebcf5d8c",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "              precision    recall  f1-score   support\n",
      "\n",
      "         0.0       0.46      0.87      0.61        68\n",
      "         1.0       0.88      0.48      0.62       132\n",
      "\n",
      "    accuracy                           0.61       200\n",
      "   macro avg       0.67      0.68      0.61       200\n",
      "weighted avg       0.74      0.61      0.62       200\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Evaluate validation data performance\n",
    "print(classification_report(validate_y, model.predict(validate)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d57ffbdc-f0b3-4fb6-9575-5acd983082cf",
   "metadata": {},
   "source": [
    "The recall on the validation set for bad loans (class 0) is 0.87, meaning that the model correctly identified 87% of the bad loans. However, the precision of 0.46 tells us that more than half of the loans the model flags as bad are actually good. Precision and recall are technical metrics. To truly assess the model's value, we would need feedback from the business side on the impact of misclassifications (for both good and bad loans).\n",
    "\n",
    "The gap between performance on the training and validation data (for example, bad-loan recall of 0.92 vs. 0.87) suggests that the model is slightly overfitting the training data. Remember that this is just a quick baseline model. To improve it further, we could:\n",
    "- gather more data\n",
    "- engineer features\n",
    "- experiment with hyperparameter settings\n",
    "- experiment with other model types\n",
    "\n",
    "In fact, this is just a start. Creating AI models that meet business needs often requires a lot of guided experimentation."
   ]
  },
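  {
   "cell_type": "markdown",
   "id": "sketch-misclassification-cost-md",
   "metadata": {},
   "source": [
    "The trade-off described above can be made concrete with a small, self-contained sketch. It builds a confusion matrix from toy labels (not this notebook's data) and weighs the two error types with placeholder costs. `COST_MISSED_BAD` and `COST_REJECTED_GOOD` are hypothetical values that the business side would need to supply. To apply the same idea to the baseline model, one could pass `validate_y` and `model.predict(validate)` in place of the toy arrays."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-misclassification-cost-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only: toy labels and assumed costs, not results from this notebook\n",
    "import numpy as np\n",
    "from sklearn.metrics import confusion_matrix\n",
    "\n",
    "# Toy labels: 0 = bad loan, 1 = good loan\n",
    "y_true = np.array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1])\n",
    "y_pred = np.array([0, 0, 1, 0, 0, 1, 1, 1, 1, 1])\n",
    "\n",
    "# Rows are true classes, columns are predicted classes\n",
    "cm = confusion_matrix(y_true, y_pred, labels=[0, 1])\n",
    "bad_caught, bad_missed = cm[0]\n",
    "good_rejected, good_accepted = cm[1]\n",
    "\n",
    "# Recall and precision for the bad-loan class (class 0)\n",
    "recall_bad = bad_caught / (bad_caught + bad_missed)\n",
    "precision_bad = bad_caught / (bad_caught + good_rejected)\n",
    "\n",
    "# Hypothetical per-loan costs in arbitrary units (placeholders for business input)\n",
    "COST_MISSED_BAD = 5.0\n",
    "COST_REJECTED_GOOD = 1.0\n",
    "total_cost = bad_missed * COST_MISSED_BAD + good_rejected * COST_REJECTED_GOOD\n",
    "\n",
    "print(f'recall (bad loans):    {recall_bad:.2f}')\n",
    "print(f'precision (bad loans): {precision_bad:.2f}')\n",
    "print(f'total assumed cost:    {total_cost:.1f}')"
   ]
  },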
  {
   "cell_type": "markdown",
   "id": "0378d21a-d6db-42f9-851a-ce71f68c6802",
   "metadata": {},
   "source": [
    "### Save the Model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4450a328-f00c-4579-8e08-b2ebe5046961",
   "metadata": {},
   "source": [
    "Finally, we save the trained model so that we can load it later in the serving environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "da7a7906-d54f-4f2d-9803-6c82c86b28ad",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['rf_model.pkl']"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Save the model to a pickle file\n",
    "joblib.dump(model, \"rf_model.pkl\")"
   ]
  },
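  {
   "cell_type": "markdown",
   "id": "sketch-joblib-round-trip-md",
   "metadata": {},
   "source": [
    "As a quick sanity check, a saved model can be loaded back and compared with the original. The sketch below is a minimal round trip on a toy model and synthetic data (the names `toy_model` and `toy_model.pkl` are illustrative and not part of this example). The same `joblib.load` call should work for `rf_model.pkl`, provided a compatible scikit-learn version is installed. Since joblib files are pickle-based, only load files from sources you trust."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sketch-joblib-round-trip-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative round trip on a toy model (not the notebook's rf_model.pkl)\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "import joblib\n",
    "import numpy as np\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "\n",
    "# Synthetic data: the label depends only on the sign of the first feature\n",
    "rng = np.random.default_rng(0)\n",
    "X = rng.normal(size=(50, 3))\n",
    "y = (X[:, 0] > 0).astype(int)\n",
    "toy_model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)\n",
    "\n",
    "# Save to a temporary directory, then load the model back\n",
    "with tempfile.TemporaryDirectory() as tmp_dir:\n",
    "    path = os.path.join(tmp_dir, 'toy_model.pkl')\n",
    "    joblib.dump(toy_model, path)\n",
    "    restored = joblib.load(path)\n",
    "\n",
    "# The restored estimator should reproduce the original predictions\n",
    "assert np.array_equal(toy_model.predict(X), restored.predict(X))"
   ]
  },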
  {
   "cell_type": "markdown",
   "id": "299588b8-ab67-4155-97a9-770e8e4a7476",
   "metadata": {},
   "source": [
    "In the next notebook, [04_Credit_Risk_Model_Serving.ipynb](04_Credit_Risk_Model_Serving.ipynb), we will load the trained model and request predictions, with input features provided by the Feast online feature server."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
